Abstract

Background: Existing tools for modeling cell growth curves do not offer a flexible, integrative approach to managing large
datasets and automatically estimating parameters. With the growing volume of experimental time series in microbiology
and oncology, researchers need software that lets them organize experimental data easily and, at the same time,
extract the relevant parameters efficiently.

Results: BGFit provides a unified web-based platform where a rich set of dynamic models can be fitted to
experimental time-series data and the results can be managed efficiently in a structured, hierarchical way.
The data management system organizes projects, experiments and measurement data, and lets users define teams
with different editing and viewing permissions. Several dynamic and algebraic models are already implemented, such
as polynomial regression, Gompertz, Baranyi, Logistic and Live Cell Fraction models, and users can easily add new
models, thus expanding the current set.

Conclusions: BGFit allows users to easily manage their data and models in an integrated way, even if they are not
familiar with databases or existing computational tools for parameter estimation. BGFit is designed with a flexible
architecture that focuses on extensibility and leverages free software together with existing tools and methods, making
it possible to compare and evaluate different data modeling techniques. The application is described in the context of
fitting bacterial and tumor cell growth data, but it is also applicable to any type of two-dimensional data, e.g. physical
chemistry and macroeconomic time series, and it scales to a large number of projects and to high data and model complexity.



Background 

Modeling cell growth and estimating curve parameters 
from data is a common task in areas ranging from micro- 
biology to oncology. 

In microbiology studies, maximal growth rate and maximal
biomass are probably the two best studied bacterial
growth properties. In a nutshell, these physiological
properties provide a rough reflection of how well a bacterial
cell benefits from a particular set of nutrients. Thus, they
can be used to guide a myriad of applications. One such
example is the use of growth rate maximization
as an objective in constraint-based reconstruction and
analysis of metabolic networks.

In oncology, it is also crucial to model tumor growth
and understand its dynamics under different internal
and external perturbations. For example, linking growth
parameters with pharmacokinetic-pharmacodynamic
(PK/PD) models can help predict the response of
tumor dynamics to distinct drug regimes
[1]. This knowledge can be further used to optimize
the design of new experiments and support preclinical
development of oncology drugs.

Hence, it is relevant for experimentalists to extract
key parameters from growth curves, such as maximum
growth rate, lag phase and asymptotic maximum optical
density (OD), in order to characterize cell and tissue
physiology. Furthermore, it is also useful to compare the
fittings obtained with different models. Due to the development
of high-throughput techniques, the amount of data being
generated is growing fast, which makes the data harder to
manage in large collaborative projects and also hampers
model identification procedures.

Several dynamic models based on differential and algebraic
equations have already been proposed and are
extensively used in these fields. These include sigmoid-like
curves such as Logistic, Gompertz, Richards, Schnute
and Stannard [2,3]. These curves are usually fitted to growth
data with in-house software or freely available tools.
The latter include DMFit (available
at http://www.ifr.ac.uk/safety/dmfit/) and GInaFiT [4],
which offer Excel add-ins to model data according to
several implemented dynamic models, as well as R
packages, namely grofit [5] and cellGrowth
in Bioconductor. Other tools, such as MicroHibro
(http://www.microhibro.com/), are available as web
applications but with limited functionality. Recently developed
databases such as ComBase [6] and LabBase [7]
aggregate and store time series of bacterial growth
under several experimental conditions, serving as benchmarks
for Predictive Microbiology.

Although dynamic models and databases for biological
growth data have now reached a mature state, there is
still, to our knowledge, no easy-to-use software that allows
experimentalists who are not familiar with computational
tools to extract relevant parameters in an easy and automated
way while managing their data efficiently. This
is the main motivation for the development of
BGFit, whose flexible and expandable architecture also
makes it possible to integrate more sophisticated and
complex models, both algebraic and differential. Its main
utility is thus to provide a user-friendly web service that
couples database management with model inference, with
expected applicability in several areas of research. The
examples provided here (data available for illustrative
purposes on the webpage) include Microbiology projects, with
the estimation of bacterial growth curves under different
sugars, and Oncology, where models for the time
evolution of carcinoma weight are inferred.

Implementation 

The BGFit web application serves both as (i) an automated
fitting tool for experimental data that uses an extensible set of
dynamic models through a distributed architecture, and
(ii) a data repository that stores and manages experimental
data.

The data modeling features allow users to choose a 
dynamic model and estimate the parameters that best 
describe the dataset. With this information BGFit sim- 
ulates the estimated curve and presents the results in a 
chart along with the original dataset and goodness-of-fit 
measures. 

This automated process can be applied to a single dataset
or to a collection that aggregates similar or complementary
data, such as replicates of an experiment. This provides
both a global view of the aggregated data and fine
control over specific measurements.
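
The estimate-then-simulate loop described above can be sketched in a few lines. The following is a minimal illustration and not BGFit's actual estimator: it assumes a Zwietering-style logistic curve, uses an exhaustive grid search inside user-supplied parameter ranges as a stand-in for the regression step, and pools two synthetic replicates by concatenating their points. All function names, ranges and data are hypothetical.

```python
import itertools
import math


def logistic(t, a, mu, lam):
    """Logistic growth curve: asymptote a, maximum rate mu, lag time lam."""
    return a / (1.0 + math.exp(4.0 * mu / a * (lam - t) + 2.0))


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def grid(lo, hi, steps):
    """Evenly spaced candidate values covering [lo, hi]."""
    return [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]


def estimate(times, values, ranges, steps=21):
    """Return (best_params, best_rmse) from an exhaustive grid search."""
    best = None
    for params in itertools.product(*(grid(lo, hi, steps) for lo, hi in ranges)):
        err = rmse(values, [logistic(t, *params) for t in times])
        if best is None or err < best[1]:
            best = (params, err)
    return best


# Two synthetic replicates generated from the same (assumed) true curve.
times_a = [0, 2, 4, 6, 8, 10, 12]
times_b = [1, 3, 5, 7, 9, 11]
times = times_a + times_b  # pooling replicates = concatenating their points
values = [logistic(t, 2.0, 0.5, 3.0) for t in times]

ranges = [(1.0, 3.0), (0.0, 1.0), (0.0, 6.0)]  # search ranges for (a, mu, lam)
params, error = estimate(times, values, ranges)

# Simulation step: evaluate the model at the estimated parameters.
fitted_curve = [logistic(t, *params) for t in times]
```

Fitting a single measurement uses the same call with only that measurement's points. A practical estimator would replace the grid search with a nonlinear least-squares routine, but the inputs (data points and parameter ranges) and the outputs (parameters, simulated curve and an error measure) keep the same shape.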

BGFit's repository of dynamic models allows users to
apply their own models, as well as to take advantage of an
existing and expandable set of contributed models, each
contributing to a richer environment. With this functionality
it is possible to compare the results of different fittings in a
single dashboard. The models currently implemented are
the Baranyi, Gompertz, Logistic and Richards models [2,8],
first and second order polynomial regression, exponential
decay, Lumry-Eyring - LENP type Ib (ODE) [9] for
modeling the kinetics of irreversible protein aggregation,
the Hyperbolastic growth model of type III (H3) [10] and the
Live Cell Fraction model [11,12]. To complement the dynamic
modeling feature, users can also apply manual regression
to the data, traditionally performed as a linear fitting in
logarithmic scale.
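
As an illustration of the manual regression mentioned above, the sketch below fits a straight line to the logarithm of the measurements over the exponential phase, so that the slope estimates the specific growth rate. The readings are synthetic and the helper name `linear_fit` is hypothetical; this is not the code used by BGFit.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical optical density readings taken during exponential growth.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
od = [0.05 * math.exp(0.6 * h) for h in hours]

# Linear fit in logarithmic scale: ln(OD) = growth_rate * t + ln(OD_0).
growth_rate, log_od0 = linear_fit(hours, [math.log(v) for v in od])
doubling_time = math.log(2) / growth_rate
```

The choice of which points belong to the exponential phase is left to the user, which is what makes the regression manual.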

While not intended to be exhaustive, this list covers
a wide set of algebraic and differential models that
are used in many areas of research and serves as a basis
for future expansions by users.

The data management features support the modeling
process and facilitate collaboration by creating a
central point of access. One of the motivations for this
application is the need for a better collaboration
workflow that avoids exchanging files through traditional
methods, such as emails and shared folders. Thus, BGFit
features hierarchical data storage where users can
define their own teams and assign read/write permissions
accordingly. Additionally, a project can be made
public, so that the data can be openly shared and published
online.

All the input data and results, such as the time series,
estimated parameters, model simulations and charts, are
available for direct download for further analysis.

The entire source code for BGFit and the implemented
models is available online, as well as the instructions to
set up a fully functional local installation. This addresses
data confidentiality by allowing each laboratory to keep a
local BGFit version for private projects.

Architecture and data structure 

BGFit is developed using open-source frameworks and
free libraries, which allows for a high degree of flexibility and
yields a modular system built on Ruby on Rails,
MySQL, Octave, MathJax and Google Chart Tools.

The application is designed using a model-view-controller
architecture. Data management is effectively separated from
dynamic modeling, which is performed by extensions
that are decoupled from the web application.

A modeling extension only has to implement
the required interface and be deployed
at a location that is accessible by BGFit. This approach
allows every component of BGFit to be deployed
online, encouraging collaboration and the reuse of
these tools. BGFit can also be used in a local installation while
keeping access to all the developed models.

Input data is stored in a hierarchical
organization with three layers. The top-level layer,
project, defines global properties for the project, such as
user permissions and whether it is publicly available. The
middle layer, experiments, aggregates the different results
in folders. The bottom layer, measurements, is the user's
actual data and can store 3-dimensional annotated data,
although only the first two dimensions are used in the
modeling extensions for now (Figure 1). BGFit thus acts as
a central repository for data, models and fittings.
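
The three-layer organization can be pictured with a small data model. The sketch below is only illustrative: the class and field names are hypothetical and do not reflect BGFit's actual database schema. It shows projects carrying permissions and a public flag, experiments grouping measurements, and measurements holding three-dimensional points of which only the first two dimensions are passed on to modeling.

```python
from dataclasses import dataclass, field


@dataclass
class Measurement:
    title: str
    points: list  # (x, y, z) tuples; z holds the optional annotation


@dataclass
class Experiment:
    title: str
    measurements: list = field(default_factory=list)


@dataclass
class Project:
    title: str
    public: bool = False
    permissions: dict = field(default_factory=dict)  # user -> "read" or "write"
    experiments: list = field(default_factory=list)

    def can_read(self, user):
        """Public projects are readable by anyone; otherwise a grant is needed."""
        return self.public or user in self.permissions

    def can_write(self, user):
        return self.permissions.get(user) == "write"

    def xy_series(self):
        """Yield (experiment, measurement, [(x, y), ...]) for the modeling step."""
        for exp in self.experiments:
            for m in exp.measurements:
                yield exp.title, m.title, [(p[0], p[1]) for p in m.points]


project = Project("Bacterial growth", permissions={"alice": "write", "bob": "read"})
replicate = Measurement("replicate A", [(0, 0.05, "start"), (1, 0.09, None)])
project.experiments.append(Experiment("glucose", [replicate]))
series = list(project.xy_series())
```

Keeping permissions at the project level, as in this sketch, means that every experiment and measurement below it inherits the same visibility.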

Modelling extensions 

One of the strengths of BGFit is that the set of dynamic
models is easy to expand. Modeling the data in
the application is performed through a REpresentational
State Transfer (REST) web service that receives a set of
parameters as input and returns the function's result.

The web service should support two functions and a
baseline for comparisons between different models, e.g.,
root mean square error (RMSE): 1) Parameter estimation,
which takes the data points, such as a time series, and a range
for each parameter, and outputs the estimated parameters
using linear/nonlinear regression; and 2) Model
simulation, which receives a set of parameters for the model
as input and returns a simulated curve.

The modeling extension should implement these functions
to be fully compliant. This approach imposes a strict
interface for communication but, on the other hand, it
offers flexibility in the implementation of the model, as it
is technology agnostic.
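
The two-function contract can be sketched as follows. This is a hedged illustration of the idea only, not the Model Blackbox interface: the JSON field names (`action`, `x`, `y`, `params`, `ranges`), the class names and the `handle` dispatcher are all assumptions, the HTTP transport is omitted, and a first-order polynomial stands in for a growth model because it has a closed-form least-squares solution.

```python
import json
import math
from abc import ABC, abstractmethod


class ModelExtension(ABC):
    """Hypothetical contract: one estimation and one simulation function."""

    @abstractmethod
    def estimate(self, points, ranges):
        """Return (params, out_of_range) for a list of (x, y) points."""

    @abstractmethod
    def simulate(self, params, xs):
        """Return the model value at each x for the given parameters."""


class FirstOrderPolynomial(ModelExtension):
    def estimate(self, points, ranges):
        n = len(points)
        mx = sum(x for x, _ in points) / n
        my = sum(y for _, y in points) / n
        slope = (sum((x - mx) * (y - my) for x, y in points)
                 / sum((x - mx) ** 2 for x, _ in points))
        fitted = {"slope": slope, "intercept": my - slope * mx}
        # Closed-form least squares ignores bounds, so violations are only reported.
        outside = [k for k, (lo, hi) in ranges.items() if not lo <= fitted[k] <= hi]
        return fitted, outside

    def simulate(self, params, xs):
        return [params["slope"] * x + params["intercept"] for x in xs]


def handle(model, request_json):
    """Dispatch a JSON request to the model and return a JSON response."""
    req = json.loads(request_json)
    if req["action"] == "simulate":
        return json.dumps({"y": model.simulate(req["params"], req["x"])})
    points = list(zip(req["x"], req["y"]))
    params, outside = model.estimate(points, req["ranges"])
    predicted = model.simulate(params, req["x"])
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(req["y"], predicted)) / len(points))
    return json.dumps({"params": params, "rmse": rmse, "out_of_range": outside})


request = json.dumps({
    "action": "estimate",
    "x": [0, 1, 2, 3],
    "y": [1.0, 3.0, 5.0, 7.0],
    "ranges": {"slope": [0, 5], "intercept": [0, 5]},
})
response = json.loads(handle(FirstOrderPolynomial(), request))
```

Because the caller only sees JSON in and JSON out, the model behind `handle` could equally be an Octave script or a compiled binary, which is the sense in which a strict interface leaves the implementation free.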

The necessary technical documents, templates and
examples are fully described in the Model Blackbox
public repository (https://github.com/averissimo/model_blackbox),
providing a starting point for users to create
and implement their own interface-compliant models.

The available templates offer two approaches, implemented
in the Octave and MATLAB numerical computing
environments: either as a script for Octave, making it
possible to deploy the modeling extensions without any
licensing issues, or as a standalone application for MATLAB,
taking advantage of SBToolbox2 [13] functions.

Results and discussion 

In order to illustrate the organization of the data and how
to retrieve the available information, we exemplify the
application of the BGFit tool in two different projects: 1) bacterial
growth fitting and 2) tumor cell growth. The data
is available at the webpage, along with all the necessary
documentation (see also Additional file 1).

In Table 1 some of the models implemented in BGFit are 
shown. 

Figure 2 illustrates several features of BGFit. Panel A)
shows the overview webpage of a specific model (Hyperbolastic
growth), where the user can define the algebraic or
differential equation, along with a detailed description
of the parameters, such as the expected search
ranges and which parameters represent initial conditions.
The source code for estimation and simulation is
automatically generated, thus expanding the model collection
currently available. On this page the user can also
download all the statistical data of the fittings performed.
Panel B) shows one measurement fitted with different
models, including a manual regression, so that they can be
compared graphically and numerically. The simulations
are plotted along with the original experimental data, further
supporting visual inspection of the results. Panel C)
illustrates the simultaneous estimation of different measurements
of the same experiment. This makes it possible to fit a
model to several replicates, which is useful for finding an
average model for similar experimental conditions. Panel D)
shows all the estimation results obtained for a given
model, allowing the user to download the information as
a CSV file.

The results obtained by BGFit are in good agree- 
ment with previous estimation procedures and are made 
directly available at the webpage for further analysis. 




Figure 1 BGFit architecture. BGFit is designed using a model-view-controller architecture and the data is organized hierarchically into projects, 
experiments and measurements. Users can introduce their data through the web-based platform, while other collaborators independently fit any 
implemented model. BGFit was developed using Ruby on Rails web-framework, MySQL, Octave/MATLAB and Google Chart Tools. 
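Table 1 Some models implemented in BGFit

Name | Equation | Reference

Hyperbolastic growth type III (H3) | $P(t) = M - \alpha \exp\left(-\delta t^{\gamma} - \operatorname{arcsinh}(\theta t)\right)$, with $\alpha = (M - P_0) \exp\left(\delta t_0^{\gamma} + \operatorname{arcsinh}(\theta t_0)\right)$ | [10]

Live Cell Fraction | $\frac{dr}{dt} = \frac{r(t)}{3} \left( (\alpha + \delta) \frac{3\lambda}{3\lambda + r(t)} - \delta \right)$ | [11,12]

Baranyi | $y(t) = y_0 + \mu_{max} A(t) - \frac{1}{m} \ln\left(1 + \frac{e^{m \mu_{max} A(t)} - 1}{e^{m (y_{max} - y_0)}}\right)$, with $A(t) = t + \frac{1}{\nu} \ln\left(e^{-\nu t} + e^{-h_0} - e^{-\nu t - h_0}\right)$ | [8]

Gompertz | $y(t) = A \exp\left\{-\exp\left[\frac{\mu_{max} e}{A} (\lambda - t) + 1\right]\right\}$ | [2]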
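
To make two of the algebraic models concrete, the sketch below evaluates the H3 and Gompertz curves as plain functions. It assumes the standard published forms, $P(t) = M - \alpha \exp(-\delta t^{\gamma} - \operatorname{arcsinh}(\theta t))$ with $\alpha = (M - P_0) \exp(\delta t_0^{\gamma} + \operatorname{arcsinh}(\theta t_0))$ for H3, and $y(t) = A \exp\{-\exp[(\mu_{max} e / A)(\lambda - t) + 1]\}$ for the modified Gompertz model. The function and argument names are hypothetical and the parameter values in the usage lines are arbitrary.

```python
import math


def h3(t, m, delta, gamma, theta, p0, t0=0.0):
    """Hyperbolastic type III curve through the initial condition (t0, p0)."""
    alpha = (m - p0) * math.exp(delta * t0 ** gamma + math.asinh(theta * t0))
    return m - alpha * math.exp(-delta * t ** gamma - math.asinh(theta * t))


def gompertz(t, a, mu_max, lam):
    """Modified Gompertz curve: asymptote a, maximum rate mu_max, lag lam."""
    return a * math.exp(-math.exp(mu_max * math.e / a * (lam - t) + 1.0))


# Arbitrary illustrative parameters, not values from the paper.
start = h3(2.0, m=7.5, delta=0.01, gamma=2.0, theta=0.1, p0=1.0, t0=2.0)
plateau = gompertz(1000.0, a=2.0, mu_max=0.5, lam=3.0)
```

By construction the H3 curve passes through the initial condition, so `start` equals `p0` up to rounding, and the Gompertz curve approaches its asymptote `a` for large `t`.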



[Figure 2 image. Panel A: model page for the Hyperbolastic growth model of type III (H3), showing the simulation URL, the parameter estimation URL, the model definition and the parameter list. Panel B: manual regression results and fitted models for a single measurement. Panel C: fitted models for an aggregated experiment. Panel D: parameter estimation results for the Live Cell Fraction model.]

Figure 2 BGFit modeling examples. Screenshots of the BGFit application, illustrating different features of the tool. A) Model definition - example of
the Hyperbolastic growth model of type III (H3) description. B) Single measurement fitting - comparison between several models for the same
time-series data on bacterial growth curves. C) Aggregated experiment fitting - illustrates the simultaneous estimation of several
measurements or replicates of the same experiment. D) Aggregated estimation results - shows the model page where the user can assess and
download all the information for the estimated projects.
In particular, we have compared our results with published
data and also with the R software package grofit [5], to
demonstrate the consistency of the obtained fittings.

More specifically, we used as a first test set the
time evolution of the weight of Solid Ehrlich Carcinoma
treated with combined IAA and DMSO, described in [10].
These data were fitted with a hyperbolastic growth model
of type III (H3), proposed by Eby and colleagues. In
this comparison, BGFit returned similar results. The
parameters reported in the original paper were ($M = 7.533$,
$\delta = 3.594 \cdot 10^{-9}$, $\gamma = 4.712$ and $\theta = 0.004$,
RMSE = 0.07264) and those obtained by BGFit were ($M = 7.547$,
$\delta = 5.103 \cdot 10^{-9}$, $\gamma = 4.630$ and $\theta = 0.004$,
RMSE = 0.066), showing an excellent agreement. The BGFit
estimation slightly improved on the previous fitting, owing to
a better adjustment of the curve in the initial phase
(see figures at http://kdbio.inesc-id.pt/bgfit/experiments/113/measurements/270
for details).
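
The comparison between the two parameter sets can be made explicit with a short calculation of relative differences. The sketch below uses the values quoted in the text, reading the exponent of the delta parameter as $10^{-9}$; the dictionary keys are just labels chosen for this illustration.

```python
# Parameter values as quoted in the text (delta read as multiples of 1e-9).
reported = {"M": 7.533, "delta": 3.594e-9, "gamma": 4.712, "theta": 0.004, "RMSE": 0.07264}
bgfit = {"M": 7.547, "delta": 5.103e-9, "gamma": 4.630, "theta": 0.004, "RMSE": 0.066}

# Relative change of each BGFit value with respect to the reported one.
relative_change = {
    name: (bgfit[name] - reported[name]) / reported[name] for name in reported
}

for name, change in relative_change.items():
    print(f"{name:>6}: {change:+.1%}")
```

Under this reading, M, gamma and theta differ by less than 2%, and the RMSE obtained by BGFit is roughly 9% lower. The delta parameter shows the largest relative difference, although both estimates are of order $10^{-9}$ in absolute terms.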

The second comparison was with the results
obtained by the state-of-the-art grofit package in R [5]. We
used the test datasets provided upon installation
and compared BGFit with grofit results for the Gompertz
and Richards models (see all seven measurements
at http://kdbio.inesc-id.pt/bgfit/projects/62/experiments/146,
along with figures and tables), thus demonstrating the
consistency of our results.

The obtained fittings and corresponding comparisons 
and simulation results are fully available at the webpage 
under projects Tumor Growth and Comparing BGFit with 
R respectively. 

Conclusions 

The main contributions of BGFit are a platform
for automated data modeling of large time-series
datasets and a baseline for comparison between
different models, either novel or already described in the
literature.

Because the application is based on a distributed
architecture that separates heavy calculations from
data management and storage, the computational load is
spread across different locations and a network of
models. This allows BGFit to scale as the user base grows.

BGFit supports collaborative projects by providing a
central repository that can be used by several teams
simultaneously, handling large experimental datasets
through a clean, hierarchical organization of the
data. BGFit allows users to implement and reuse an
ever-growing network of models to improve the validation of
their methods, thus supporting model comparison and
selection procedures.

BGFit is designed as a parameter estimation platform
for any type of two-dimensional data. Despite being
described in the context of cell growth data, the application
can easily be used in other areas with different
dynamic and algebraic models, such as physical chemistry
and econometrics.

Availability and requirements 

Project name: BGFit 

Project home page: http://kdbio.inesc-id.pt/bgfit 
Operating system(s): Platform independent 
Programming language: Ruby 
Other requirements: Ruby 1.9.3 or higher 
License: GNU GPL v2 

Any restrictions to use by non-academics: Only those 
imposed already by the license. 

Additional file 



Additional file 1: BGFit complete documentation. BGFit application full
user and technical documentation as of April 2013 (for reference purposes).
The most recent version is available at http://kdbio.inesc-id.pt/bgfit/bgfit_documentation.pdf